Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectra
Abstract
In our earlier work, we have measured human intelligibility of stimuli reconstructed either from the short-time magnitude spectra or short-time phase spectra of a speech signal. We demonstrated that, even for small analysis window durations of 20-40 ms (of relevance to automatic speech recognition), the short-time phase spectrum can contribute to speech intelligibility as much as the short-time magnitude spectrum. Reconstruction was performed by overlap-addition of modified short-time segments, where each segment had either the magnitude or the phase of the corresponding original speech segment. In this paper, we employ an iterative framework for signal reconstruction. With this framework, we see that a signal can be reconstructed to within a scale factor when only phase is known, while this is not the case for magnitude. The magnitude must be accompanied by sign information (i.e., one bit of phase information) for unique reconstruction. In the absence of all magnitude information, we explore how much phase information is required for intelligible signal reconstruction. We observe that (i) intelligible signal reconstruction (albeit noisy) is possible from knowledge of only the phase sign information, and (ii) when both time and frequency derivatives of phase are known, adequate information is available for intelligible signal reconstruction. In the absence of either derivative, an unintelligible signal results.
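The phase-only case described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Hann window, the hop size, the random starting signal and the iteration count are all assumptions made for illustration. The sketch alternates between imposing the known STFT phase and returning to the time domain through a least-squares overlap-add inverse, keeping the magnitude of the current estimate at each pass.

```python
import numpy as np


def stft(x, win, hop):
    """Windowed frames of x transformed with a real FFT (one row per frame)."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    return np.array([np.fft.rfft(win * x[s:s + n]) for s in starts])


def istft(X, win, hop, length):
    """Least-squares overlap-add inverse of a (possibly modified) STFT."""
    n = len(win)
    y = np.zeros(length)
    norm = np.zeros(length)
    for m, spec in enumerate(X):
        s = m * hop
        y[s:s + n] += win * np.fft.irfft(spec, n)  # windowed frame estimate
        norm[s:s + n] += win ** 2                  # squared-window weight
    return y / np.maximum(norm, 1e-8)              # guard samples with no weight


def reconstruct_from_phase(phase, win, hop, length, n_iter=100, seed=0):
    """Iteratively estimate a signal when only its STFT phase is known.

    Assumption: the iteration starts from white noise; the result can only
    match the original up to a positive scale factor.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(length)
    for _ in range(n_iter):
        mag = np.abs(stft(x, win, hop))            # magnitude of current estimate
        x = istft(mag * np.exp(1j * phase), win, hop, length)  # impose known phase
    return x
```

For example, with a 256-sample window and a hop of 64 samples (75% overlap), `reconstruct_from_phase(np.angle(stft(x, win, 64)), win, 64, len(x))` returns an estimate whose waveform should be compared with `x` only after normalizing both, since the overall scale is not determined by the phase.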
Similar resources
Iterative reconstruction of speech from short-time Fourier transform phase and magnitude spectra
In this paper, we consider the topic of iterative, one dimensional, signal reconstruction (specifically speech signals) from the magnitude spectrum and the phase spectrum. While this topic has been extensively researched and documented, we wish to recast some well-established results for the benefit of new researchers and those who desire a short, yet comprehensive, review of the subject. The t...
Analysis of signal reconstruction after modulation filtering
When the short-time Fourier transform (STFT) of an audio signal is arbitrarily modified, it no longer truly represents a time-domain signal. Classically, the accepted solution to obtain a time-domain signal from a modified STFT (MSTFT) is to invert the MSTFT to a time-domain signal that has an STFT that is closest to the MSTFT in a least squares sense. This is also the approach currently taken ...
On the usefulness of STFT phase spectrum in human listening tests
The short-time Fourier transform (STFT) of a speech signal has two components: the magnitude spectrum and the phase spectrum. In this paper, the relative importance of short-time magnitude and phase spectra for speech perception is investigated. Human perception experiments are conducted to measure intelligibility of speech stimuli synthesized either from magnitude spectra or phase spectra. It ...
Role of phase estimation in speech enhancement
Typical speech enhancement algorithms that operate in the Fourier domain only modify the magnitude component. It is commonly understood that the phase component is perceptually unimportant, and thus, it is passed directly to the output. In recent intelligibility experiments, it has been reported that the Short-Time Fourier Transform (STFT) phase spectrum can provide significant intelligibility ...
Signal estimation from modified short-time Fourier transform
In this paper, we present an algorithm to estimate a signal from its modified short-time Fourier transform (STFT). This algorithm is computationally simple and is obtained by minimizing the mean squared error between the STFT of the estimated signal and the modified STFT. Using this algorithm, we also develop an iterative algorithm to estimate a signal from its modified STFT magnitude. The iter...
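For orientation, the least-squares estimate and the magnitude-only iteration that this abstract describes are commonly written as below. The notation is chosen here and is an assumption, not taken from the truncated text: $w$ is the analysis window, $S$ the hop size, $y_m(n)$ the inverse DFT of the modified STFT for frame $m$, $|Y_m(k)|$ the given magnitude, and $X_m^{(i)}(k)$ the STFT of the $i$-th signal estimate $x^{(i)}$.

```latex
% Least-squares signal estimate from a modified STFT (weighted overlap-add)
x(n) = \frac{\sum_{m} w(n - mS)\, y_m(n)}{\sum_{m} w^{2}(n - mS)}

% Magnitude-only iteration: keep the given magnitude, reuse the current phase,
% then apply the estimate above to \hat{X}_m^{(i+1)} to obtain x^{(i+1)}
\hat{X}_m^{(i+1)}(k) = |Y_m(k)| \, \frac{X_m^{(i)}(k)}{|X_m^{(i)}(k)|}
```

The denominator normalizes by the summed squared window, so an unmodified STFT is inverted exactly wherever at least one window has nonzero weight.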
Publication date: 2005